8 research outputs found

    Inferring hidden features in the Internet (PhD thesis)

    The Internet is a large-scale decentralized system composed of thousands of independent networks. In this system, two main components, interdomain routing and traffic, are vital inputs for many tasks such as traffic engineering, security, and business intelligence. However, due to the decentralized structure of the Internet, global knowledge of both interdomain routing and traffic is hard to come by. In this dissertation, we address a set of statistical inference problems with the goal of extending the knowledge of the interdomain-level Internet. In the first part of this dissertation, we investigate the relationship between the interdomain topology and an individual network’s inference ability. We first frame the questions through abstract analysis of idealized topologies, and then use actual routing measurements and topologies to study the ability of real networks to infer traffic flows. In the second part, we study the ability of networks to identify which paths flow through their network. We first show that answering this question is surprisingly hard because interdomain routing is designed so that each network learns only a limited set of routes. Network operators therefore have to rely on observed traffic. However, observed traffic can only confirm that a particular route passes through a network; it cannot confirm that a route does not. To solve this routing inference problem, we propose a nonparametric inference technique that is quite accurate. The key idea behind our technique is measuring the distances between destinations. To that end, we define a metric called Routing State Distance (RSD) that measures distances in terms of routing similarity. Finally, in the third part, we study our new metric, RSD, in detail. Using RSD, we address the important and difficult problem of characterizing the set of paths between networks. The collection of paths across networks is a rich source for understanding important phenomena in the Internet, as path selections are driven by the economic and performance considerations of the networks. We show that RSD has a number of appealing properties that help uncover these hidden phenomena.
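
The abstract above describes RSD only as a distance between destinations in terms of routing similarity. As a minimal sketch, assume RSD counts the networks whose next hop toward one destination differs from their next hop toward the other. That counting rule, the function name `routing_state_distance`, and the toy next-hop table are assumptions made for illustration; they are not taken from the listing.

```python
from typing import Dict, Hashable

# next_hop[node][destination] -> neighbor that `node` forwards to for `destination`.
NextHopTable = Dict[Hashable, Dict[Hashable, Hashable]]


def routing_state_distance(next_hop: NextHopTable, d1: Hashable, d2: Hashable) -> int:
    """Count the nodes whose next hop toward d1 differs from their next hop toward d2.

    Assumed definition for illustration only.
    """
    return sum(
        1
        for table in next_hop.values()
        if table.get(d1) != table.get(d2)
    )


# Hypothetical toy data: three networks (A, B, C) and three destinations (p1, p2, p3).
next_hop = {
    "A": {"p1": "B", "p2": "B", "p3": "C"},
    "B": {"p1": "D", "p2": "D", "p3": "D"},
    "C": {"p1": "D", "p2": "E", "p3": "E"},
}

print(routing_state_distance(next_hop, "p1", "p2"))  # only C routes them differently
print(routing_state_distance(next_hop, "p1", "p3"))  # A and C route them differently
```

Under this assumed definition, two destinations are close when most networks forward traffic for them the same way, which is one way to read "routing similarity".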

    Describing and Forecasting Video Access Patterns

    Computer systems are increasingly driven by workloads that reflect large-scale social behavior, such as rapid changes in the popularity of media items like videos. Capacity planners and system designers must plan for rapid, massive changes in workloads when such social behavior is a factor. In this paper we make two contributions intended to assist in the design and provisioning of such systems. We analyze an extensive dataset consisting of the daily access counts of hundreds of thousands of YouTube videos. In this dataset, we find that there are two types of videos: those that show rapid changes in popularity, and those that are consistently popular over long time periods. We call these two types rarely-accessed and frequently-accessed videos, respectively. We observe that most of the videos in our dataset clearly fall into one of these two types. For each type of video we ask two questions. First, are there relatively simple models that can describe its daily access patterns? Second, can we use these simple models to predict the number of accesses that a video will have in the near future, as a tool for capacity planning? To answer these questions, we develop two different frameworks for characterizing and forecasting access patterns. We show that for frequently-accessed videos, daily access patterns can be extracted via principal component analysis and used efficiently for forecasting. For rarely-accessed videos, we demonstrate a clustering method that allows one to classify bursts of popularity and use those classifications for forecasting.
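
As a rough illustration of extracting daily access patterns with principal component analysis and reusing them for forecasting, the sketch below fits components on synthetic, noise-free data and extrapolates a partially observed series by least squares. The synthetic data, the function names `fit_patterns` and `forecast_tail`, and the least-squares extrapolation step are all assumptions for illustration; the abstract does not specify the paper's actual forecasting procedure.

```python
import numpy as np


def fit_patterns(counts: np.ndarray, k: int):
    """Return (mean, components): the column mean and the top-k principal
    directions of the rows of `counts` (videos x days)."""
    mean = counts.mean(axis=0)
    _, _, vt = np.linalg.svd(counts - mean, full_matrices=False)
    return mean, vt[:k]


def forecast_tail(observed: np.ndarray, mean: np.ndarray, components: np.ndarray):
    """Fit component weights to the observed first days by least squares,
    then use those weights to extrapolate the remaining days."""
    t = observed.shape[0]
    weights, *_ = np.linalg.lstsq(components[:, :t].T, observed - mean[:t], rcond=None)
    return mean[t:] + weights @ components[:, t:]


# Synthetic stand-in for daily access counts: a weekly cycle plus a linear trend.
rng = np.random.default_rng(0)
days = np.arange(28)
weekly = 1.0 + 0.5 * np.sin(2 * np.pi * days / 7)
trend = days / 28.0
counts = np.outer(rng.uniform(1, 5, 200), weekly) + np.outer(rng.uniform(0, 3, 200), trend)

mean, components = fit_patterns(counts, k=2)

# A new (hypothetical) video: observe 21 days, forecast the final 7.
new_video = 3.0 * weekly + 2.0 * trend
forecast = forecast_tail(new_video[:21], mean, components)
print(np.abs(forecast - new_video[21:]).max())  # error is near zero on noise-free data
```

Because the synthetic data is exactly rank two, two components recover the tail almost perfectly; real access counts are noisy, so the number of components and the error would have to be evaluated empirically.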

    Declarative Transport: No More Transport Protocols to Design, Only Policies to Specify

    Transport protocols are an integral part of the inter-process communication (IPC) service used by application processes to communicate over the network infrastructure. With almost 30 years of research on transport, one would have hoped that we have a good handle on the problem. Unfortunately, that is not true. As the Internet continues to grow, new network technologies and new applications continue to emerge, putting transport protocols in a never-ending flux as they are continuously adapted for these new environments. In this work, we propose a clean-slate transport architecture that renders all possible transport solutions as simply combinations of policies instantiated on a single common structure. We identify a minimal set of mechanisms that, once instantiated with the appropriate policies, allows any transport solution to be realized. Given our proposed architecture, we contend that there are no more transport protocols to design, only policies to specify. We implement our transport architecture in a declarative language, Network Datalog (NDlog), making the specification of different transport policies easy, compact, reusable, dynamically configurable and potentially verifiable. In NDlog, transport state is represented as database relations, state is updated and queried using database operations, and transport policies are specified using declarative rules. We identify limitations of NDlog that could potentially threaten the correctness of our specification. We propose several language extensions to NDlog that would significantly improve the programmability of transport policies. NSF (CISE/CNF 0820138, CISE/CNS 070604, CISE/CNS 0524477, CNS/ITR 0205294, CISE/EIA RI 0202067)

    On the Performance and Robustness of Managing Reliable Transport Connections

    We revisit the problem of connection management for reliable transport. At one extreme, a pure soft-state (SS) approach (as in Delta-t [9]) safely removes the state of a connection at the sender and receiver once the state timers expire, without the need for explicit removal messages. New connections are likewise established without an explicit handshaking phase. On the other hand, a hybrid hard-state/soft-state (HS+SS) approach (as in TCP) uses both explicit handshaking and timer-based management of the connection’s state. In this paper, we consider the worst-case scenario of reliable single-message communication, and develop a common analytical model that can be instantiated to capture either the SS approach or the HS+SS approach. We compare the two approaches in terms of goodput, message overhead, and state overhead. We also use simulations to compare against other approaches, and evaluate them in terms of correctness (with respect to data loss and duplication) and robustness to bad network conditions (high message loss rate and variable channel delays). Our results show that the SS approach is more robust and has lower message overhead. On the other hand, SS requires more memory to keep connection states, which reduces goodput. Given that memories are getting bigger and cheaper, SS is the best choice over bandwidth-constrained, error-prone networks.
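
The timer-based idea behind the soft-state approach can be sketched as a connection table whose entries are created implicitly by the first packet and silently dropped once a timer expires. This is a deliberately simplified illustration: the class `SoftStateTable`, its single per-connection timeout, and the explicit `now` argument are assumptions, and the sketch does not reproduce Delta-t's actual timer rules.

```python
class SoftStateTable:
    """Toy soft-state connection table: no handshake, no explicit teardown."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self._expires = {}  # connection id -> time at which its state expires

    def expire(self, now: float) -> None:
        """Drop every connection whose timer has run out by `now`."""
        for conn_id in [c for c, t in self._expires.items() if t <= now]:
            del self._expires[conn_id]

    def on_packet(self, conn_id: str, now: float) -> bool:
        """Create or refresh state for `conn_id`; return True if it was newly created."""
        self.expire(now)
        is_new = conn_id not in self._expires
        self._expires[conn_id] = now + self.timeout
        return is_new

    def active(self, now: float) -> set:
        """Return the ids of connections that still hold state at `now`."""
        self.expire(now)
        return set(self._expires)


table = SoftStateTable(timeout=3.0)
print(table.on_packet("c1", now=0.0))  # first packet creates state implicitly
print(table.on_packet("c1", now=2.0))  # later packet only refreshes the timer
print(table.active(now=4.0))           # still alive: refreshed timer ends at 5.0
print(table.active(now=5.0))           # timer expired, state removed with no message
```

Passing the clock in as `now` keeps the sketch deterministic; a hybrid HS+SS design would add explicit open and close messages on top of the same timers.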

    On the Universal Generation of Mobility Models

    Mobility models have traditionally been tailored to specific application domains such as human, military, or ad hoc transportation scenarios. This tailored approach often renders a mobility model useless when the application domain changes, and leads to wrong conclusions about the performance of protocols and applications running atop different domains. In this work, we propose and implement a mobility modeling framework based on the observation that the mobility characteristics of most mobility-based applications can be captured in terms of